Arabic Roots Extraction Using Morphological Analysis

Authors

  • Aymen Abu-Errub
  • Ashraf Odeh
  • Qusai Shambour
  • Osama Al-Haj Hassan
Abstract

The Arabic language is characterized by a rich and complex morphology based on root-pattern schemes. Root extraction is one of the most important tasks in natural language processing applications such as information retrieval, text processing, machine translation, and speech tagging. This paper presents a method for extracting the triliteral (three-consonant) roots of Arabic words by removing prefixes and suffixes and then using a list of morphological weights. Experimental results based on a list of eleven different root inflections show the effectiveness of the proposed method, with a success rate of 94%.
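
The general idea described in the abstract (strip affixes, then match the remaining stem against morphological weights) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the affix lists, the pattern list, and the names `PREFIXES`, `SUFFIXES`, `PATTERNS`, `match_pattern`, and `extract_root` are all assumptions made for this example, and the input is assumed to be unvoweled Arabic text. In each pattern, the letters ف, ع, and ل mark the three root positions, and every other letter must match the stem literally.

```python
# Hypothetical affix and pattern lists, chosen only for illustration.
# The paper's actual lists are not reproduced in this abstract.
PREFIXES = ["وال", "بال", "ال", "و", "ب"]
SUFFIXES = ["ات", "ون", "ين", "ها", "ة"]

# Morphological weights: the letters ف, ع, ل stand for the three root slots.
PATTERNS = [
    "فعل", "فاعل", "مفعول", "فعال", "فعيل", "مفعل",
    "افتعل", "استفعل", "تفعيل", "انفعل", "تفاعل",
]
ROOT_SLOTS = "فعل"


def match_pattern(stem, pattern):
    """Return the three root letters if stem fits pattern, else None."""
    if len(stem) != len(pattern):
        return None
    root = []
    for stem_char, pattern_char in zip(stem, pattern):
        if pattern_char in ROOT_SLOTS:
            root.append(stem_char)       # slot position: take the stem letter
        elif stem_char != pattern_char:
            return None                  # literal position: letters must agree
    return "".join(root)


def extract_root(word):
    """Try progressively more affix stripping until some pattern matches."""
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        for suffix in [""] + SUFFIXES:
            if not word.endswith(suffix):
                continue
            stem = word[len(prefix):len(word) - len(suffix)]
            if len(stem) < 3:
                continue
            for pattern in PATTERNS:
                root = match_pattern(stem, pattern)
                if root is not None:
                    return root
    return None


if __name__ == "__main__":
    for w in ["كاتب", "المكتوب", "الكاتبون", "استخرج"]:
        print(w, extract_root(w))
```

The sketch tries the unstripped word first so that a root letter that happens to look like a prefix is not removed prematurely. It does not handle weak letters, doubled consonants, or ambiguous analyses, which a full root extractor would need to address.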

Similar resources

Rule-based Approach for Arabic Root Extraction: New Rules to Directly Extract Roots of Arabic Words

Extracting word roots in the Arabic language is very difficult because of the language's specific morphological and structural changes. Several techniques have been proposed to address this problem. This paper continues the work begun in [1] on identifying and exploiting relationships among Arabic letters for Arabic root extraction. Eight different rules that detect the root letters accordi...

An Improved Arabic Word's Roots Extraction Method Using N-Gram Technique

The Arabic language is distinguished by its morphological richness, which forces those working in Arabic language processing (e.g., information retrieval, document classification, text summarization) to deal with many words that seem different but in fact derive from the same root. One way to overcome this problem is to reduce words to their roots. Thi...

A Markovian Approach for Arabic Root Extraction

In this paper, we present an Arabic morphological analysis system that assigns, for each word of an unvoweled Arabic sentence, a unique root depending on the context. The proposed system is composed of two modules. The first one consists of an analysis out of context. In this module, we segment each word of the sentence into its elementary morphological units in order to identify its possible r...

Enhancing Root Extractors Using Light Stemmers

The rise of Natural Language Processing (NLP) opened new possibilities for applications that were not feasible before. A morphologically rich language such as Arabic introduces a set of features, such as roots, that can assist the progress of NLP. Many tools have been developed to carry out root extraction (stemming). Stemmers have improved many NLP tasks without explicit know...

Unsupervised Induction of Arabic Root and Pattern Lexicons using Machine Learning

We describe an approach to building a morphological analyser of Arabic by inducing a lexicon of root and pattern templates from an unannotated corpus. Using maximum entropy modelling, we capture orthographic features from surface words, and cluster the words based on the similarity of their possible roots or patterns. From these clusters, we extract root and pattern lexicons, which allows us to...

Publication date: 2014